Robust feature extraction using subband spectral centroid histograms
Abstract
In this paper we propose a new framework for utilizing frequency information from the short-term power spectrum of speech. Feature extraction is based on the cepstral coefficients derived from the histograms of subband spectral centroids (SSC). Two new feature extraction algorithms are proposed: one based on frequency information alone, and the other efficiently combining the frequency and amplitude information from the speech power spectrum. An experimental study on an automatic speech recognition task shows that the proposed methods outperform conventional speech front-ends in the presence of additive white noise, while performing comparably in noise-free conditions.
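The abstract does not give the algorithmic details, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' method. It computes one centroid per subband of a frame's power spectrum, builds a histogram of those centroids, and takes a cosine transform of the log histogram. The equal-width subbands, the histogram bin count, the `log1p` power weighting used for the combined frequency and amplitude variant, and all function names are hypothetical choices made for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def subband_centroids(power, sample_rate, n_subbands):
    """Return a (centroid_hz, subband_power) pair per subband.

    Assumption: subbands are equal-width slices of the spectrum.
    """
    n_bins = len(power)
    hz_per_bin = (sample_rate / 2) / (n_bins - 1)
    edges = [round(i * n_bins / n_subbands) for i in range(n_subbands + 1)]
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = power[lo:hi]
        total = sum(band)
        if total > 0:
            # Power-weighted mean frequency of the subband.
            centroid = sum((lo + i) * hz_per_bin * p
                           for i, p in enumerate(band)) / total
        else:
            # Silent subband: fall back to its midpoint.
            centroid = (lo + hi - 1) / 2 * hz_per_bin
        out.append((centroid, total))
    return out


def centroid_histogram(centroids, sample_rate, n_hist_bins, use_amplitude):
    """Histogram of centroid frequencies over 0..Nyquist.

    use_amplitude=False counts each centroid once (frequency only).
    use_amplitude=True weights each count by log1p(subband power),
    which is an assumed way of folding in amplitude information.
    """
    nyquist = sample_rate / 2
    hist = [0.0] * n_hist_bins
    for freq, band_power in centroids:
        idx = min(int(freq / nyquist * n_hist_bins), n_hist_bins - 1)
        hist[idx] += math.log1p(band_power) if use_amplitude else 1.0
    return hist


def cepstrum(hist, n_coeffs):
    """DCT-II of the log histogram; the small floor avoids log(0)."""
    n = len(hist)
    logs = [math.log(h + 1e-3) for h in hist]
    return [sum(v * math.cos(math.pi * q * (i + 0.5) / n)
                for i, v in enumerate(logs))
            for q in range(n_coeffs)]


# Usage: one 64-sample frame of a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(64)]
centroids = subband_centroids(power_spectrum(frame), SAMPLE_RATE, n_subbands=4)
freq_only = cepstrum(centroid_histogram(centroids, SAMPLE_RATE, 8, False), 5)
freq_and_amp = cepstrum(centroid_histogram(centroids, SAMPLE_RATE, 8, True), 5)
```

In this toy frame the subband containing the tone has a centroid very close to 1000 Hz, so the centroid tracks the spectral peak rather than the overall energy level. That property is the usual motivation for centroid-based features under additive noise.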
Similar resources
Robust parameters for speech recognition based on subband spectral centroid histograms
In this paper we propose a new speech parameterization framework that efficiently combines frequency and magnitude information from the short-term power spectrum of speech. This is achieved through computation of subband spectral centroid histograms (SSCH). The relationship between the proposed method and auditory-based speech parameterization methods is discussed. An experimental study on an autom...
A Robust Front-End Processor combining Mel Frequency Cepstral Coefficient and Sub-band Spectral Centroid Histogram methods for Automatic Speech Recognition
Environmental robustness is an important area of research in speech recognition. Mismatch between trained speech models and the actual speech to be recognized arises from factors such as background noise. It can severely degrade the accuracy of recognizers based on commonly used features such as mel-frequency cepstral coefficients (MFCC) and linear predictive coding (LPC). It is well u...
Investigation of Spectral Centroid Magnitude and Frequency for Speaker Recognition
Most conventional features used in speaker recognition are based on spectral envelope characterizations such as Mel-scale filterbank cepstrum coefficients (MFCC), Linear Prediction Cepstrum Coefficients (LPCC) and Perceptual Linear Prediction (PLP). The MFCC's success has made it a de facto standard feature for speaker recognition. Alternative features that convey information other than ...
Spectral Subband Centroids as Complementary Features for Speaker Authentication
Most conventional features used in speaker authentication are based on estimation of spectral envelopes in one way or another, e.g., Mel-scale Filterbank Cepstrum Coefficients (MFCCs), Linear-scale Filterbank Cepstrum Coefficients (LFCCs) and Relative Spectral Perceptual Linear Prediction (RASTA-PLP). In this study, Spectral Subband Centroids (SSCs) are examined. These features are the centroid...
Stress level detection using double-layer subband filter
Stress level detection is important for human error prevention and health care services. Speech-based stress level detection is the most effective, as speech data can be obtained in nonintrusive and inexpensive ways. In this paper, we explore features that use a Double-Layered Subband (DLS) filter for detecting stress level from speech. Spectral Centroid Frequency (SCF) and Spectral Centroid A...